Insights

Overall admission results


  • 81 applicants.
  • Accepted (~35%), rejected (~30%), and waitlisted (~35%).
    • Average acceptance among U.S.-based colleges in 2021 was 66%. Source.
  • Increased selectivity.

Admission results by gender


@TODO

  • male 44, admit 15 + decline 16 + waitlist 13
  • female 34, admit 12 + decline 12 + waitlist 10

Applicants by geolocations (1)


Census Regions and Divisions

  • Unfeasible to check the data by states.
  • Aggregation by regions.
  • Three states are the major origins of applicants.

Applicants by geolocations (2)


  • No significant difference in quantitative measures of applicants.

Actionable Decisions

Column

Chart 1

Chart 2

About

Left Column Text

The dataset contains 81 valid admission results from the CSV file SummerStudentAdmissions2.csv.

Three versions of this dataset are included on the right hand side:

  • Standardized data.
  • Cleaned data.
  • Raw data.

Due to the lack of information, some of the variables and contents from the dataset are interpreted intuitively.

In the cleaned dataset,

  • gender=-1 means the gender is undisclosed.
  • volunteer_level is ranked from 5 to 0.
  • gpa is calculated on a 4.0 scale.
  • writing_score should be on a 100 scale.
  • test_score has rather limited information.
  • work_exp’s unit is year.

The dashboard is powered by


The codes are open-sourced. Please feel free to star or fork this repository.

Star Fork

Right Column Table

Standardized data

Cleaned data

Raw data

---
title: "Admission Dashboard"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    orientation: columns
    vertical_layout: fill
    social: ["twitter", "linkedin"]
    source_code: embed
    theme: bootstrap
    logo: static/logo.png
    favicon: static/favicon.png
    css: style.css
---

```{r setup-and-data-loading, include=FALSE}
gc()
rm(list = ls())

if (!require("pacman")) install.packages("pacman")
pacman::p_load(
  tidyverse, flexdashboard,
  here, styler, patchwork,
  hrbrthemes, ggthemes, ggtext, plotly,
  glue, waffle, DT, geofacet, ggbeeswarm,
  ggridges, treemapify
)

dat <- read_csv(here("data/data-cleaned.csv"))

style_file("index.Rmd")

# standardize all numeric variables

dat_stand <- dat |>
  mutate(
    decision = as_factor(decision),
    state = as_factor(state),
    gender = as_factor(gender),
    across(where(is.numeric), ~ round(scale(.)[, 1], 2)),
    partition = case_when(
      state %in% c("California", "Colorado", "Utah", "Oregon") ~ "west",
      state %in% c("Vermont", "New York") ~ "northeast",
      TRUE ~ "south"
    )
  )

dat_stand_long <- dat_stand |> pivot_longer(
  cols = c(
    gpa, work_exp, test_score,
    writing_score, volunteer_level
  ),
  names_to = "variable",
  values_to = "value"
)

dat_long <- dat |>
  mutate(partition = case_when(
    state %in% c("California", "Colorado", "Utah", "Oregon") ~ "west",
    state %in% c("Vermont", "New York") ~ "northeast",
    TRUE ~ "south"
  )) |>
  pivot_longer(
    cols = c(
      gpa, work_exp, test_score,
      writing_score, volunteer_level
    ),
    names_to = "variable",
    values_to = "value"
  )
```

Insights {.storyboard data-icon="fa-chart-line" data-commentary-width=200}
===================================== 


### **Overall admission results**

```{r, fig.width=8, fig.height=8}
dat |>
  count(decision) -> admission_summary

p1 <- ggplot(admission_summary, aes(fill = decision, values = n)) +
  geom_waffle(color = "white", size = 1.125, n_rows = 9, flip = TRUE) +
  scale_fill_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  coord_equal() +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "SUMMER 2022 ADMISSION RESULTS",
    subtitle = "BAD DATA EXCLUDED.",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "",
    y = ""
  )

p1
```

***

- **81 applicants.**
- Accepted (~35%), rejected (~30%), and waitlisted (~35%).
  - Average acceptance among U.S.-based colleges in 2021 was 66%. [Source](https://www.collegedata.com/resources/the-facts-on-fit/understanding-college-selectivity).
- **Increased selectivity.**

### **Admission results by gender**

```{r, fig.width=12,fig.height=8}

dat |>
  select(decision, gender) |>
  mutate(gender = as_factor(gender)) |>
  ggplot(aes(gender)) +
  geom_bar(aes(fill = decision),
    position = position_stack(reverse = TRUE),
    width = 0.2
  ) +
  scale_fill_manual(
    values = c("#1A6899", "#FC5449", "#FFCF58"),
    labels = c("Admit", "Decline", "Waitlist")
  ) +
  scale_x_discrete(labels = c("undisclosed", "female", "male")) +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.text.x = element_text(),
    axis.text.y = element_text(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "SUMMER 2022 ADMISSION RESULTS",
    subtitle = "BY GENDER",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "gender",
    y = "count"
  )
```

*** 

@TODO

- male 44, admit 15 + decline 16 + waitlist 13
- female 34, admit 12 + decline 12 + waitlist 10

### **Applicants by geolocations (1)**

```{r, fig.width=12,fig.height=8}
p3 <- dat_stand |>
  group_by(state, partition) |>
  summarize(count = n()) |>
  ggplot(aes(
    area = count, fill = partition,
    label = count, subgroup = state
  )) +
  geom_treemap() +
  geom_treemap_subgroup_border(color = "white", size = 2) +
  geom_treemap_subgroup_text(
    place = "centre", grow = TRUE,
    alpha = 0.5, colour = "white",
    fontface = "italic"
  ) +
  geom_treemap_text(
    color = "white", place = "bottomright",
    alpha = 0.7, fontface = "bold"
  ) +
  scale_fill_brewer(palette = "Set1") +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    # axis.text.x = element_blank(),
    # axis.text.y = element_blank(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "TREEMAP OF APPLICANTS",
    subtitle = "SUMMER 2022",
    caption = glue("SOURCE: SUMMERSTUDENTADMISSION2.CSV"),
    x = "",
    y = ""
  )

# ggplotly(p3)
p3
```

*** 

![Census Regions and Divisions](https://upload.wikimedia.org/wikipedia/commons/thumb/f/f1/Census_Regions_and_Division_of_the_United_States.svg/1280px-Census_Regions_and_Division_of_the_United_States.svg.png){width=100%}

- Unfeasible to check the data by states.
- Aggregation by **regions**.
- Three states are the major origins of applicants.

### **Applicants by geolocations (2)**

```{r, fig.width=12,fig.height=8}
p4 <- dat_long |>
  ggplot(aes(x = value, y = partition, color = partition, fill = partition)) +
  geom_density_ridges(alpha = 0.7) +
  scale_y_discrete(expand = c(0, 0)) + # will generally have to set the `expand` option
  scale_x_continuous(expand = c(0, 0)) + # for both axes to remove unneeded padding
  coord_cartesian(clip = "off") + # to avoid clipping of the very top of the top ridgeline +
  facet_wrap(~variable, scales = "free") +
  scale_fill_brewer(palette = "Set1") +
  scale_color_brewer(palette = "Set1") +
  theme_ipsum_rc() +
  theme(
    legend.position = "bottom",
    legend.title = element_text(color = "#000000"),
    legend.text = element_text(color = "#000000"),
    axis.title.y = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    axis.title.x = element_text(
      family = "IBM Plex Sans", face = "bold"
    ),
    # axis.text.x = element_blank(),
    # axis.text.y = element_blank(),
    text = element_text(
      family = "IBM Plex Sans",
      color = "#3B372E"
    ),
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_markdown(hjust = 0.5),
    plot.caption = element_markdown(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  ) +
  labs(
    title = "RIDGEPLOT OF APPLICANTS' QUANTITATIVE MEASURES",
    subtitle = "SUMMER 2022",
    caption = glue("NOT ENOUGH NORTHEAST DATA FOR RIDGE PLOT.
SOURCE: SUMMERSTUDENTADMISSION2.CSV"), x = "candidates' standardized measurement", y = "density" ) p4 ``` *** - **No significant difference in quantitative measures of applicants.** Actionable Decisions {data-icon="fa-graduation-cap" data-orientation=columns} ===================================== Inputs {.sidebar data-width=600} ------------------------------------- ```{r} # shiny inputs defined here ``` Column ------------------------------------- ### Chart 1 ```{r} ``` ### Chart 2 ```{r} ``` About {data-icon="fa-info" data-orientation=columns} ===================================== Left Column Text {data-width=350} ----------------------------------------------------------------------- The dataset contains 81 valid admission results from the CSV file `SummerStudentAdmissions2.csv`. Three versions of this dataset are included on the right hand side: - Standardized data. - Cleaned data. - Raw data. Due to the lack of information, some of the variables and contents from the dataset are interpreted intuitively. In the cleaned dataset, - `gender=-1` means the gender is undisclosed. - `volunteer_level` is ranked from 5 to 0. - `gpa` is calculated on a 4.0 scale. - `writing_score` should be on a 100 scale. - `test_score` has rather limited information. - `work_exp`'s unit is year. *** The dashboard is powered by - [`flexdashboard`](https://pkgs.rstudio.com/flexdashboard/) - [`DT`](https://rstudio.github.io/DT/) - [`plotly`](https://plotly.com/) - The static visualization theme is customized based on [`hrbrmstr`](https://github.com/hrbrmstr/hrbrthemes). *** The codes are open-sourced. **Please feel free to star or fork this repository.**

Star Fork

Right Column Table {.tabset data-width=650 data-height=1000} ----------------------------------------------------------------------- ### Standardized data ```{r} DT::datatable(dat_stand, options = list( bPaginate = FALSE ), style = "bootstrap" ) |> formatStyle( "decision", backgroundColor = styleEqual( c("Admit", "Decline", "Waitlist"), c("#1A6899", "#FC5449", "#FFCF58") ) ) |> formatStyle(c( "gpa", "work_exp", "test_score", "writing_score", "volunteer_level" ), background = styleColorBar(range(c( dat_stand$gpa, dat_stand$work_exp, dat_stand$test_score, dat_stand$writing_score, dat_stand$volunteer_level )), "lightblue"), backgroundSize = "98% 88%", backgroundRepeat = "no-repeat", backgroundPosition = "center" ) ``` ### Cleaned data ```{r} dat <- read_csv("data/data-cleaned.csv") DT::datatable(dat, options = list( bPaginate = FALSE ), style = "bootstrap" ) |> formatStyle( "decision", backgroundColor = styleEqual( c("Admit", "Decline", "Waitlist"), c("#1A6899", "#FC5449", "#FFCF58") ) ) ``` ### Raw data ```{r} DT::datatable(read_csv("data/SummerStudentAdmissions2.csv"), options = list( bPaginate = FALSE ), style = "bootstrap" ) ```